Inverse Lyndon words and Inverse Lyndon factorizations of words
Abstract
Motivated by applications to string processing, we introduce variants of the Lyndon factorization called inverse Lyndon factorizations. Their factors, named inverse Lyndon words, belong to a class that strictly contains anti-Lyndon words, that is, Lyndon words with respect to the inverse lexicographic order. We prove that any nonempty word w admits a canonical inverse Lyndon factorization, named ICFL(w), that maintains the main properties of the Lyndon factorization of w: it can be computed in linear time, it is uniquely determined, and it preserves a compatibility property for sorting suffixes. In particular, the compatibility property of ICFL(w) is a consequence of another result: any factor in ICFL(w) is a concatenation of consecutive factors of the Lyndon factorization of w with respect to the inverse lexicographic order. As for the applications, experimental results on biological datasets show that ICFL(w) combined with the Lyndon factorization is intermediate between the Lyndon factorization and the LZ factorization with respect to the size of the factors. Moreover, ICFL(w) allows us to handle factors of the Lyndon factorization that are too long or too short.
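As a point of reference for the two classical factorizations the abstract builds on, the following is a minimal sketch of Duval's linear-time algorithm for the Lyndon factorization, with an option to compare letters in the reversed alphabet order so that the factors are anti-Lyndon words. It is not the ICFL algorithm of the paper, which is not described in the abstract. The function name `lyndon_factorization` and the `inverse` flag are illustrative choices, and the sketch assumes that the inverse lexicographic order is obtained by reversing the order on letters.

```python
def lyndon_factorization(w, inverse=False):
    """Duval's algorithm: split w into a non-increasing sequence of Lyndon words.

    With inverse=True (an illustrative flag, not from the paper), letters are
    compared in reversed alphabet order, so the factors are Lyndon words with
    respect to that reversed order (anti-Lyndon words).
    """
    def less(a, b):
        # Strict letter comparison in the chosen alphabet order.
        return a > b if inverse else a < b

    factors = []
    n = len(w)
    i = 0
    while i < n:
        j, k = i + 1, i
        # Extend the current run while w[k] <= w[j] in the chosen order.
        while j < n and not less(w[j], w[k]):
            if less(w[k], w[j]):
                k = i          # strictly larger letter: restart the comparison
            else:
                k += 1         # equal letters: keep matching the period
            j += 1
        # Emit as many copies of the Lyndon word of length j - k as fit.
        while i <= k:
            factors.append(w[i:i + j - k])
            i += j - k
    return factors


if __name__ == "__main__":
    print(lyndon_factorization("banana"))                # ['b', 'an', 'an', 'a']
    print(lyndon_factorization("banana", inverse=True))  # ['ba', 'na', 'na']
```

On the word "banana" the two orders give factorizations with different factor sizes, which is the kind of difference the abstract exploits when combining factorizations.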
Similar resources
Lyndon factorization of the Thue-Morse word and its relatives
Some attention has recently been given to the Lyndon factorization of infinite words [16], [10], [12]. These works are themselves related to the earlier works by Reutenauer [13] and Varricchio [17], concerned with unavoidable regularities and semigroup theory. The results we present here reinforce those in [10] and [12], and give an additional application of the general Lyndon factorization the...
Infinite Smooth Lyndon Words
Lyndon words: a class of words having lexicographical order properties. Smooth words: a class of words, related to the Kolakoski word, that can be easily compressed. Some infinite smooth words are also Lyndon words.
Primitive Words and Lyndon Words in Automatic and Linearly Recurrent Sequences
We investigate questions related to the presence of primitive words and Lyndon words in automatic and linearly recurrent sequences. We show that the Lyndon factorization of a k-automatic sequence is itself k-automatic. We also show that the function counting the number of primitive factors (resp., Lyndon factors) of length n in a k-automatic sequence is k-regular. Finally, we show that the numb...
Lyndon Words and Singular Factors of Sturmian Words
Two different factorizations of the Fibonacci infinite word were given independently in [10] and [6]. In a certain sense, these factorizations reveal a self-similarity property of the Fibonacci word. We first describe the intimate links between these two factorizations. We then propose a generalization to characteristic Sturmian words...
Universal Lyndon Words
A word w over an alphabet Σ is a Lyndon word if there exists an order defined on Σ for which w is lexicographically smaller than all of its conjugates (other than itself). We introduce and study universal Lyndon words, which are words over an n-letter alphabet that have length n! and such that all the conjugates are Lyndon words. We show that universal Lyndon words exist for every n and exhibit...
Journal: CoRR
Volume: abs/1705.10277
Publication date: 2017